METHOD AND APPARATUS FOR PRODUCING
PSEUDO-CONSTANT BITS PER PICTURE VIDEO
BIT-STREAMS FOR LOW-DELAY COMPRESSION SYSTEM



Technical Field 



The invention relates generally to the field of digital
video compression, and more particularly, to a facility for
producing a pseudo-constant bits per picture compressed
bit-stream for real-time video applications such as
interpersonal or multimedia communications, e.g.,
video-conferencing or video-telephony, where the end-to-end
encoding/decoding delay should be low.



Background of the Invention 



Production and transmission of information have
undergone drastic changes in recent years. This evolution is
mainly due to the availability of reliable and sophisticated
digital communications networks, digital storage media, and
digital compression specifications, which have facilitated
emission and management of a wide array of digital assets
such as motion video, image, text, audio, data, and graphic
information. Motion video, due to its widespread application
in various chains of digital infrastructures and the
abundant information that it carries, has received
significant attention from the research and development
community. As a result, a number of methods have been
developed to deal with encoding of moving pictures at
various spatial and temporal sampling rates. These methods
are intended to elevate the use of digital video in
industry, encourage the enhancement of current products,
and finally accelerate the definition of future products.

For example, the MPEG-2 international standard formed
by the Moving Picture Experts Group, and described in
ISO/IEC 13818-2, "Information Technology - Generic Coding of
Moving Pictures and Associated Audio Information: Video,
1996," which is hereby incorporated herein by reference in
its entirety, adopts the tool-kit approach of "profiles" and
"levels" to encompass the needs of many factions within the
broadcast, consumer, and entertainment sectors. A "profile"
defines a subset of tools available to encode a video
sequence, while a "level" deals with the spatio-temporal
resolution of a video source.



The book by B. G. Haskell, A. Puri, and A. N.
Netravali, Digital Video: An Introduction to MPEG-2, Chapman
and Hall, New York, 1997, which is hereby incorporated
herein by reference in its entirety, explains various
components of an MPEG-2 encoder in detail. Most digital
video encoders rely on some form of an image analyzer, such
as the Discrete Cosine Transformation (DCT), to exploit
intra-picture pixel-to-pixel redundancies, and motion
estimation/compensation units to remove the inter-picture
pixel-to-pixel redundancies. Since hardware realization of
the above image processing techniques is more practical for
rectangularly-shaped groups of pixels, the majority of
specifications for digital video compression adopt a
block-based approach of processing the image data.

A very efficient form of digital video compression is
achieved by classifying a plurality of pictures into
intra-coded and predicted (or inter-coded) pictures. For an
intra-coded picture only the information from the same
picture is used to perform the encoding procedure. On the
other hand, the image data in inter-coded pictures is
predicted by displacing information in other pictures within
a defined search area. The concept of searching for the best
prediction is known in the art as motion estimation. The
difference between the prediction and the picture is then
encoded. Therefore, decoding of inter-coded pictures requires
adding the decoded picture-difference to the displaced
picture. The concept of displacing pictures during the
decoding procedure is known in the art as motion
compensation.



The use of motion estimation and motion compensation
methods in inter-coded pictures helps greatly in reducing
the number of bits consumed. For cases where a good
prediction is not found for a region of a picture, the
encoder can revert to the intra-coded method to carry
out the compression task for this particular region of the
picture. An intra versus inter switch can be easily derived
for the video encoder. For ease of discussion, intra-coded
pictures are referred to as I coded and predicted-coded
pictures are labeled P coded. The aforementioned description
of a digital video encoder is clear with knowledge of the
art of video compression. Further, it is clear that a
predicted picture would consume far fewer bits
than an intra-coded picture. This methodology, although very
efficient for producing professional quality video, requires
a large encoder or decoder buffer size and consequently
imposes a longer system delay. This is because, first, the
large intra-coded pictures of the bit-stream have to fit in
the decoder buffer and, second, it takes longer for all the
bits of this type of picture to arrive in the buffer. On the
other hand, I pictures are very useful since they facilitate
random access and further impose a bound on how long a
corrupted region of the picture can leak into the rest of
the compressed video stream.



A unique application for any type of digital video
encoder is in the area of real-time video communications,
where video-conferencing, video-phone, and monitoring
compression systems with low encoding/decoding delay can be
realized. Such products require a special set of features in
order to be practical and cost effective.



Summary of the Invention 



For digital video products where low encoding/decoding
delay is of utmost importance, a different encoding strategy
should be deployed. This strategy should encourage the use
of a small buffer size, which is realized in accordance with
the present invention by producing a near-constant bits per
picture compressed stream.

The specification in ISO/IEC 13818-2 describes a
methodology for low delay encoding applications such as
visual communications. This method recommends that picture
updating, which is typically done by inserting I pictures,
be accommodated by updating only a part of the picture.
The rest of the picture is predicted. Parts of the picture
which are updated use the same encoding scheme as the one in
the intra-coded pictures. This mechanism improves the
resilience of the stream in the presence of possible byte
corruption or bad prediction. Using this methodology it is
possible to create a P-only bit-stream, which along with a
sophisticated bit-allocation scheme should facilitate the
use of a small buffer size. Most specifications and
recommendations suggest the updating of a series of pixel
blocks from left to right (i.e., across a row of blocks) or
from top to bottom (i.e., down a column of blocks), where the
updated rows would move from top to bottom and the updated
columns would move from left to right as the video is
displayed. This approach ensures that badly predicted
pixel data will not corrupt the rest of the video forever,
since it will be updated (intra-coded) within a fixed cycle.
However, the above-described compression strategies and
prior art dealing with low delay encoding methods do not
describe a methodology for producing an almost constant
number of bits per picture where the source video is moving
from one scene to another.
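
By way of illustration only, the fixed-cycle row updating described above can be sketched in Python as follows. The function names `rolling_intra_rows` and `refresh_cycle_length` and the `rows_per_picture` parameter are hypothetical; they are not taken from ISO/IEC 13818-2 or from the present disclosure.

```python
def rolling_intra_rows(picture_index, block_rows, rows_per_picture=1):
    """Return the block-row indices forced to intra mode in this picture.

    Hypothetical helper: the refreshed band rolls from top to bottom and
    wraps around, so every row is intra-coded once per cycle.
    """
    start = (picture_index * rows_per_picture) % block_rows
    return [(start + k) % block_rows for k in range(rows_per_picture)]


def refresh_cycle_length(block_rows, rows_per_picture=1):
    """Number of pictures needed to intra-code every block row once."""
    return -(-block_rows // rows_per_picture)  # ceiling division
```

With 30 rows of blocks and one refreshed row per picture, every row is intra-coded once in each cycle of 30 pictures, which bounds how long a badly predicted region can persist.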

Thus, described herein are a method and apparatus for
achieving the requirements of a low delay video encoder.
These requirements should guarantee that the actual number
of produced bits per picture in a video bit-stream is close
to a constant number, specifically when a video shot change
is detected, and further, that the whole picture is updated
without motion estimation within a pre-selected number of
pictures. The present invention is readily applicable to any
digital video encoder which employs the concept of motion
estimation and motion compensation.



Briefly summarized, the present invention comprises in
one aspect a method for processing a sequence of video
frames. The method includes dynamically encoding the
sequence of video frames to produce a pseudo-constant bits
per frame compressed signal at a scene change within the
sequence of video frames. The dynamically encoding
includes: detecting when a new scene occurs in the sequence
of video frames; and responsive to the detecting,
dynamically determining a group of frequency domain pixel
data to be retained for a frame of the new scene.



In another aspect, a method for processing a sequence
of video frames is provided which includes dynamically
encoding the sequence of video frames, where the dynamically
encoding includes: encoding multiple blocks of a first frame
of the sequence of video frames in intra-coded mode using a
first orientation for the intra-coded blocks; and encoding
multiple blocks of a second frame of a sequence of video
frames in intra-coded mode using a second orientation for
the intra-coded blocks, wherein the first orientation and
the second orientation are perpendicular.



Systems and computer program products corresponding to
the above-summarized methods are also described and claimed
herein.

Additional features and advantages are realized through
the techniques of the present invention. Other embodiments
and aspects of the invention are described in detail herein
and are considered a part of the claimed invention.

Brief Description of the Drawings 

The subject matter which is regarded as the invention 
is particularly pointed out and distinctly claimed in the 
claims at the conclusion of the specification. The above 
objects, advantages and features of the present invention 
will be more readily understood from the following detailed 
description of certain preferred embodiments of the 
invention, when considered in conjunction with the 
accompanying drawings in which: 

FIG. 1 depicts one embodiment of a low-delay digital 
encoder incorporating and using a frequency domain data 
management model for producing pseudo-constant bits per 
pictures at shot changes and an intra updating model in 
accordance with the principles of the present invention; 

FIG. 2 depicts one embodiment of a picture difficulty 
evaluator in accordance with the present invention; 

FIG. 3 is a graph of one embodiment of an N-level
quantizer for the picture difficulty indicator of FIG. 2, in
accordance with the principles of the present invention;

FIG. 4 depicts one embodiment of a frequency classifier
and its frequency pattern classes, in accordance with the
principles of the present invention;

FIG. 5 depicts one embodiment of a frequency
constrainer in accordance with the principles of the present
invention; and

FIG. 6 depicts one embodiment of logic associated with
one example of disseminating intra-coded blocks of pixels
throughout a video stream in a pseudo-random fashion in
accordance with one aspect of the present invention.

Best Mode for Carrying out the Invention 

The present invention recognizes that the conventional
method of picture updating for low delay applications, i.e.,
the example in ISO/IEC 13818-2, does not provide suitable
video quality. This is due to the fact that intra-coded
blocks of a picture will always produce fewer artifacts than
predicted blocks, and further, the monotonic way of updating
one row or column, as if rolling downward or sideways,
respectively, from picture to picture gives a viewer
sufficient time to perceive inconsistencies
in video quality. This quality variation can be described as
a worm-like phenomenon which is more easily detected in video
sources comprising many image details and a small but
constant picture velocity. Under this scenario, the
intra-coded rows or columns, which consume a large portion of
the picture bit-budget, are easily identified. Of course, one
can minimize the bit-budget of the intra-coded blocks to
ensure a more consistent video quality, but this quality
sacrifice would jeopardize the reliability of a good
reference block for regions of the picture where motion
compensation is to be performed.



Therefore, the present invention proposes disseminating
the intra-coded blocks in a pseudo-random format within the
picture, thereby offering a more consistent video quality
than the systematic way of rolling over a series of rows or
columns of blocks of pixels. The low delay encoding approach
described herein uses an intra versus inter block pattern
generating scheme to ensure the whole scene is updated after
a fixed pre-defined number of P pictures. Moreover, the
intra-coded blocks are formed in such a way that
the human eye cannot track down the high fidelity regions of
a real-time motion video. This is accomplished by forcing
the scattered intra-coded blocks to move bidirectionally.
The collection of intra-coded blocks undergoes
spatio-temporal subsampling. The resultant subsampled grid,
when overlaid on top of the output from the intra/inter
switch of the encoder, generates video bit-streams which are
significantly better than previous approaches to low delay
compression. The approach presented herein does not create a
visible discontinuity between intra-coded and predicted
blocks of a picture.
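
The pattern generating scheme is not spelled out at this point in the text, so the following Python sketch is only one assumed way to obtain the two stated properties: scattered intra-coded blocks and a complete update within a fixed number of P pictures. The names `build_intra_schedule` and `block_mode`, the seeded shuffle, and the round-robin split are all assumptions. The sketch does not model the bidirectional movement or the spatio-temporal subsampling.

```python
import random


def build_intra_schedule(num_blocks, refresh_period, seed=0):
    """Assign every block index to exactly one picture slot of the cycle.

    Hypothetical scheme: shuffle the block indices once with a fixed seed,
    then deal them round-robin into `refresh_period` slots.
    """
    order = list(range(num_blocks))
    random.Random(seed).shuffle(order)
    slots = [set() for _ in range(refresh_period)]
    for position, block in enumerate(order):
        slots[position % refresh_period].add(block)
    return slots


def block_mode(slots, picture_index, block_index, decider_says_intra):
    """Overlay the forced-intra grid on the intra/inter decision."""
    forced = block_index in slots[picture_index % len(slots)]
    return "intra" if forced or decider_says_intra else "inter"
```

Because each block index appears in exactly one slot, the whole picture is refreshed after `refresh_period` pictures, while the forced blocks in any one picture are spread over the picture rather than confined to a single row or column.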



The present invention uses modifications to the
rate-control algorithm of a digital video encoder to create
a pseudo-constant bits per picture stream. This is achieved
by assigning the same picture type P and the same number of
bits to each picture of the video source. A frequency-domain
data management model is implemented for shot changes to
ensure that all pictures of the compressed stream are
represented with a pseudo-constant number of bits. One
embodiment of the invention, and how substantially the same
number of bits is achieved throughout the video bit-stream,
is discussed below.



Low Delay Encoding Scheme 



FIG. 1 shows one example of a low delay digital video
encoder 100 which includes an intra updater 110 and a
frequency domain data management model 120 for generating
pseudo-constant bits per picture in accord with the present
invention. The intra updater 110 and frequency domain data
management model 120 are described in detail further below.



For a generic low delay encoder, intra-coding of a
block of pixels is achieved by applying a block-based
Discrete Cosine Transformer (DCT) 140, followed by a Block
Quantizer (BQ) 150, and then a Variable Length Coder (VLC)
160. The header generation unit 170 is responsible for
creating video sequence headers and the necessary start
codes which are in compliance with a given video compression
standard. The encoder buffer 172 has the responsibility of
absorbing the picture-to-picture bit-fluctuations (which
should be small for low delay applications) as generated by
the VLC unit 160 and outputting a constant bits per picture
compressed stream for transmission over a selected channel.
Since the encoder buffer 172 is of finite size, special
measures have to be taken to ensure that buffer
overflows or underflows do not occur. This is accomplished
by monitoring the content of the buffer and sending this
information to a picture Rate-Control (RC) model 182 within
a picture bit-allocation model 180. This RC model will then
impose certain limits on the picture bits.

An integral part of any video compression engine is the
picture bit-allocation model 180 shown in FIG. 1. Based on
the desired average bit-rate of the bit-stream, the user
defines the actual number of bits 183 assigned to each
picture. Since the output of the encoder is a P-only
stream, this pre-determined number has a constant value. The
picture RC model 182 may adjust this selection by
compensating for any deviations from the targeted bit-rate.
The adjustment factor is derived by comparing the output of
the picture bits counter 185 against the target bits
assigned by the picture RC model 182 using the picture bits
comparator unit 184. Additional adjustments are carried out
through a feed-back loop from encoder buffer 172 occupancy.
The picture bits counter 185 reads in the number of bits
associated with each VLC codeword to obtain the total number
of bits for each picture.
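
As a rough sketch of the constant per-picture budget and its correction, the Python fragment below derives the nominal picture bits from the average bit-rate and then nudges the next target using the measured deviation and the buffer occupancy. The control law itself is an assumption made for illustration: the gains `deviation_gain` and `buffer_gain` and the half-full buffer set point do not come from the text, and the function names are hypothetical.

```python
def nominal_picture_bits(bit_rate_bps, pictures_per_second):
    """Constant budget for a P-only stream: channel bits per picture period."""
    return bit_rate_bps / pictures_per_second


def adjusted_target_bits(nominal_bits, produced_bits_so_far, pictures_coded,
                         buffer_occupancy, buffer_size,
                         deviation_gain=0.5, buffer_gain=0.5):
    """Nudge the next picture's target back toward the average bit rate.

    Assumed control law (not from the source): subtract a fraction of the
    accumulated overshoot and of the buffer excess over a half-full set point.
    """
    overshoot = produced_bits_so_far - nominal_bits * pictures_coded
    buffer_excess = buffer_occupancy - buffer_size / 2
    target = nominal_bits - deviation_gain * overshoot - buffer_gain * buffer_excess
    return max(target, 0.0)
```

For a 3 Mbit/s stream at 30 pictures per second the nominal budget is 100,000 bits per picture; when the encoder is exactly on target and the buffer sits at its set point, the adjusted target equals the nominal value.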

The picture RC model 182 takes several statistical
measures as inputs. These are an activity measure, outputted
by a block processor 105, an actual picture quantization
number from the block-based quantizer-modulator
(Q-modulator) unit 186, and finally an actual picture bit count



END920000090US1 






from picture bits counter unit 185. These parameters, along
with information collected from the encoder buffer fullness
and the user-defined picture bits, are used to determine an
ideal picture bits number for the next picture to be
encoded. The picture RC model 182 will ultimately compute a
picture quantization value and input this along with the
ideal picture bits to the block-based Q-modulator 186. The
role of the block-based Q-modulator 186 is to ensure that
the final picture bit count is close to the target picture
bits computed by the picture RC model 182. This task is
facilitated by the picture bits comparator unit 184 which
computes the difference between the accumulated actual
picture bit count and the properly scaled target picture
bits after each block of pixels is encoded. The difference
number for the processed blocks along with an encoder buffer
172 occupancy measure are used to modulate the picture
quantization value (previously provided by the picture RC
model) at the block level. Finally, the block-based
Q-modulator unit 186 sends a nominal quantizer value to the
BQ unit 150 which will implement the quantization of the
image block.



For predicted-coding of a block of pixels, in one
embodiment, the present invention employs the components of
an intra-picture encoder previously described, plus units
such as the motion estimation unit (MEU) and motion
compensation unit (MCU). In this mode of operation, two
consecutive pictures of the input video, i.e., $P_t$ and
$P_{t+1}$, are stored in the memory unit 107. For each block
of $P_{t+1}$, a prediction is formed by displacing a block of
$P_t$ (having the same coordinates as the block of
$P_{t+1}$) within a motion window, and searching for the best
match. This process is performed by the MEU 109. It should be
noted that a video decoder used to decompress the output of
the low delay encoder has access only to the decompressed (or
reconstructed) pictures. For example, reconstruction of
picture $P_{t+1}$ at the decoder output requires the
reconstruction of picture $P_t$, which is labeled as
$\hat{P}_t$ in memory 107. In order to minimize the drift
between the reconstructed pictures at the encoder and decoder
sides, consideration should be given to displacing the blocks
of $\hat{P}_t$ at the encoder side. As a result, it is more
efficient to perform the motion estimation (ME) task in two
steps. In the first step, MEU 109 computes an estimate for
each block of $P_{t+1}$ using a block of $P_t$. This estimate
is uniquely defined by a set of motion vectors which describe
the displacement of the predicted block from its original
location in horizontal and vertical directions. In the second
step, the motion vectors of the first step are used as an
initial guess to displace a block of $\hat{P}_t$
corresponding to a block of $P_{t+1}$, and the guess is then
refined within a motion window to obtain the best prediction
for $P_{t+1}$. Therefore, it is required to store the
reconstructed picture $\hat{P}_t$ in the memory unit 107.
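
The two-step motion estimation can be illustrated with the following Python sketch, in which pictures are plain lists of pixel rows. It is a simplified model under stated assumptions: the matching criterion (sum of absolute differences), the exhaustive search, the window radii, and all function names are illustrative choices that the text does not specify.

```python
def sad(cur, ref, top, left, dy, dx, size):
    """Sum of absolute differences between a current block and a displaced
    reference block; returns None if the displaced block leaves the picture."""
    rows, cols = len(ref), len(ref[0])
    if not (0 <= top + dy <= rows - size and 0 <= left + dx <= cols - size):
        return None
    return sum(abs(cur[top + r][left + c] - ref[top + dy + r][left + dx + c])
               for r in range(size) for c in range(size))


def search(cur, ref, top, left, size, center, radius):
    """Exhaustive search of a (2*radius+1)^2 window around `center`."""
    best_vec, best_cost = None, None
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            cost = sad(cur, ref, top, left, dy, dx, size)
            if cost is not None and (best_cost is None or cost < best_cost):
                best_vec, best_cost = (dy, dx), cost
    return best_vec, best_cost


def two_step_me(cur, prev_original, prev_reconstructed, top, left, size,
                coarse_radius=4, refine_radius=1):
    """Step 1 searches the original previous picture; step 2 refines the
    resulting vector against the reconstructed previous picture."""
    coarse_vec, _ = search(cur, prev_original, top, left, size, (0, 0), coarse_radius)
    return search(cur, prev_reconstructed, top, left, size, coarse_vec, refine_radius)
```

The final vector is always measured against the reconstructed picture, which is the only reference the decoder has, so the prediction used by the encoder matches the one the decoder will form.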



It is possible that during the ME task, a good
prediction cannot be found for a block under consideration
and, hence, it is more advantageous to encode the block as
intra-coded. This decision is made by an intra/inter decider
unit 111. If this unit decides to encode a block in
inter mode, a block of $P_t$ corresponding to a source block
in $P_{t+1}$ is motion-compensated by MCU 113 using the proper
motion vectors. The output of the MCU is subtracted from the
source block in $P_{t+1}$ and the resulting block difference,
defined as the motion compensated block difference (MCBD), is
sent to the DCT unit as a subtraction 115 from the source
signal. Since the encoder 100 is also responsible for
reconstructing pictures, the MCBD blocks are decoded and
added to the output of the MCU unit 113 for forwarding to an
adder 194. Decoding comprises sending the output of BQ unit
150 to the Inverse Block Quantizer (IBQ) 190 and then to an
Inverse DCT (IDCT) unit 192. No MCU task is needed for
reconstructing blocks of the picture encoded in intra-coded
mode.



1. Frequency Domain Data Management Scheme For Production
Of Pseudo-Constant Bits Per Pictures At Shot Changes

The minimum achievable encode/decode delay in a
compression system is strongly related to the buffer size
designed into the system. An aggressive low delay encoder
should have a very small buffer size. For a steady-state
motion video where transient changes are minimal, this goal
is easily achievable. However, if there are sudden changes
in the transient behavior of the input video (e.g., a shot
change), or if the user decides to input a different source
(e.g., change a channel in real-time), a small buffer
will have trouble dealing with large compressed pictures. In
this case the decoder buffer will underflow (i.e., an
overflow condition for the encoder buffer). If the compressed
picture is too small, the decoder buffer will overflow (i.e.,
an underflow condition for the encoder buffer). In order to
circumvent such scenarios, a unique data management model in
the frequency domain 120 (FIG. 1) is presented herein for
cases where there are abrupt changes in the transient
behavior of the input video. This model removes any glitches
in the perceived video that are otherwise caused by buffer
overflow or underflow of prior low delay encoders at shot
(i.e., scene) changes. The components of a frequency domain
data management model in accordance with one embodiment of
the present invention are described below. This frequency
domain data management model is geared toward production of
pseudo-constant bits per picture compressed bit-streams in
the presence of any form of shot changes.
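
The buffer behavior described above can be made concrete with a small simulation. The Python sketch below is a simplified model, assuming that each coded picture is deposited in the encoder buffer at once and that the channel then removes a fixed number of bits per picture period; the function name and the clamping behavior are assumptions made for illustration.

```python
def simulate_encoder_buffer(picture_bits, drain_bits_per_picture, buffer_size,
                            initial_occupancy=0):
    """Track encoder buffer fullness picture by picture.

    Each picture deposits its coded bits, then the channel removes a fixed
    number of bits. Returns a list of (occupancy, event) pairs where event is
    "overflow", "underflow", or None. Simplified model; names are assumptions.
    """
    occupancy = initial_occupancy
    history = []
    for bits in picture_bits:
        occupancy += bits
        event = None
        if occupancy > buffer_size:
            event = "overflow"      # the decoder buffer would underflow
            occupancy = buffer_size
        occupancy -= drain_bits_per_picture
        if occupancy < 0:
            event = event or "underflow"   # the decoder buffer would overflow
            occupancy = 0
        history.append((occupancy, event))
    return history
```

With a constant number of bits per picture equal to the drain rate, the occupancy never moves; a single oversized picture, such as an unconstrained intra-coded picture at a shot change, is enough to overflow a small buffer.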

1.1 Shot-Change Detector

There are many methods of detecting a shot-change in an
incoming video stream. One method is to compare the means of
the luminance and chrominance components of two consecutive
pictures $P_t$ and $P_{t+1}$ to examine whether $P_{t+1}$
belongs to a new scene. Let $y_t(i,j)$, $cb_t(i,j)$, and
$cr_t(i,j)$ represent the pixel intensities of a YCbCr
digital picture at time $t$ and coordinate $(i,j)$, wherein
$i$ and $j$ represent the row and column indices,
respectively. If $m = m_r \times m_c$ is the number of pixels
in the luminance component Y of a picture, then the luminance
picture mean would be

\[
\bar{y}_t = m^{-1} \sum_{i=0}^{m_r-1} \sum_{j=0}^{m_c-1} y_t(i,j).
\]

The number of rows and columns of Y are defined by $m_r$ and
$m_c$, respectively. Similarly one can compute the chrominance
picture mean for component Cb as

\[
\overline{cb}_t = n^{-1} \sum_{i=0}^{n_r-1} \sum_{j=0}^{n_c-1} cb_t(i,j),
\]

and for component Cr as

\[
\overline{cr}_t = n^{-1} \sum_{i=0}^{n_r-1} \sum_{j=0}^{n_c-1} cr_t(i,j),
\]

with $n = n_r \times n_c$ being the number of pixels in each
chrominance component; $n_r$ and $n_c$ are the number of rows
and columns for a color component, respectively. The three
picture means are computed by the pre-processor 118 of FIG. 1
and sent to the shot-change detector unit 121. The shot-change
detector 121 will compute an indicator SCI as

\[
SCI = (a_1 + a_2 + a_3)^{-1} \times \left( a_1 |\bar{y}_{t+1} - \bar{y}_t| + a_2 |\overline{cb}_{t+1} - \overline{cb}_t| + a_3 |\overline{cr}_{t+1} - \overline{cr}_t| \right) \tag{1}
\]

A typical value for $a_1$ is 2.0, and for $a_2$ and $a_3$ one
can use 1.0. The shot-change detector unit will then decide if
a shot change is detected by comparing the value of SCI
against a pre-determined number (for example, a threshold
(Th) of Th = 10.0). If SCI > Th, a shot-change is declared
and a signal is sent to the MEU unit 109. The MEU will
inform the intra/inter decider 111 that the whole picture is
to be encoded in intra-coded mode. Shot-change detector 121
will also send a signal to a bi-state switch 122 which
toggles between frequency-constraining and non-constraining
modes. When a shot change is detected, the frequency-domain
data management model 120 is informed that for this picture
frequency-constraining 123 is required and the switch is
subsequently flipped to a "b" position. For normal video,
i.e., no presence of shot changes, the bi-state switch is in
the "a" position. It should be noted that the very first
picture of a video source is always treated as a shot change
and its encoding task follows the same rules applied to
pictures that are declared as new scenes within the stream.
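
A minimal Python sketch of the mean-comparison detector follows. It uses the typical weights (2.0, 1.0, 1.0) and the example threshold Th = 10.0 given above, and it assumes that the indicator is normalized by the sum of the weights. The function names and the representation of a picture as a (Y, Cb, Cr) tuple of row lists are illustrative assumptions.

```python
def plane_mean(plane):
    """Mean pixel value of a 2-D plane given as a list of rows."""
    return sum(sum(row) for row in plane) / (len(plane) * len(plane[0]))


def shot_change_indicator(prev_means, cur_means, weights=(2.0, 1.0, 1.0)):
    """Weighted sum of absolute differences of the Y, Cb, Cr picture means.

    Assumption: the sum is divided by the total weight (normalized form).
    """
    weighted = sum(w * abs(c - p) for w, p, c in zip(weights, prev_means, cur_means))
    return weighted / sum(weights)


def is_shot_change(prev_picture, cur_picture, threshold=10.0):
    """Each picture is a (Y, Cb, Cr) tuple of planes; True when SCI > Th."""
    prev_means = [plane_mean(p) for p in prev_picture]
    cur_means = [plane_mean(p) for p in cur_picture]
    return shot_change_indicator(prev_means, cur_means) > threshold
```

Since only three means per picture are compared, the test is inexpensive enough to run on every incoming picture before motion estimation starts.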

1.2 Picture Difficulty Evaluator 

When a shot-change is detected, the difficulty of the
picture in the new scene is evaluated by assessing a set of
picture-based statistical measures. A picture is defined as
being difficult if it is composed of many dissimilar
image structures. Examples of image structures are textures,
edges, spatial details, and color bursts. A picture with
many local image structures will yield frequency
coefficients which are oriented in different directions and
have modest to large amplitudes upon DCT implementation.
Therefore, many VLC codewords are required to represent the
picture in compressed format, which in turn will use a large
number of bits. Such a large picture may not fit in the
encoder or the decoder buffer. On the other hand, the least
difficult picture will have few fine details, if any. For
this picture, the degree of sharpness in edges or the
intensity in colors is significantly reduced.

In the present invention, the pre-processor 118 of FIG.
1 will perform a set of inter-pixel calculations on an input
picture $P_t$. These calculations are carried out in four
directions: horizontal, vertical, southwest to northeast
diagonal, and southeast to northwest diagonal. Since picture
$P_t$ can be interlaced or progressive in nature, all
inter-pixel calculations have to be done for both picture
formats. The syntax of most digital video encoders permits the
compression to be implemented in interlaced or progressive
input mode. Further, pixel processing of an interlaced
picture (which is composed of two fields) can be done in
frame or field format. This is typically referred to as
frame or field encoding and is obvious to someone who is
familiar with the art of digital video compression. Further,
it should be obvious that an interlaced frame comprises
two interleaved fields sampled at different times.
Therefore, if an interlaced frame is decomposed into two
fields, two pictures in field format, each having half the
resolution of the interlaced frame, are formed.



For purposes of discussion, the following definitions
apply: the frame-based horizontal inter-pixel difference is
defined as $Z_h$, the frame-based vertical inter-pixel
difference as $Z_{F,v}$, the frame-based 45° diagonal
inter-pixel difference as $Z_{F,d45}$, and the frame-based
135° diagonal inter-pixel difference as $Z_{F,d135}$. The
field-based inter-pixel differences for horizontal, vertical,
45° diagonal, and 135° diagonal are defined as $Z_h$,
$Z_{f,v}$, $Z_{f,d45}$, and $Z_{f,d135}$, respectively. It
should be noted that for either frame or field processing,
the task of horizontal inter-pixel differencing remains
intact since pixel data in the same memory locations will be
fetched. For frame encoding mode of an interlaced picture,
statistical measures are computed on both frame and field
formats and a set of inter-pixel indicators is fed to the
picture difficulty evaluator 124 of FIG. 2. For encoding of
progressive pictures, all inter-pixel indicators are
frame-based. For both progressive and interlaced sources,
picture $P_t$ is stored in the memory unit 107 of FIG. 1 in a
frame format. Parameter $Z_h$ for picture $P_t$ is calculated
as:

\[
\begin{aligned}
Z_h = (g_1 + g_2 + g_3)^{-1} \Big( & (m - m_r)^{-1} \sum_{i=0}^{m_r-1} \sum_{j=0}^{m_c-2} g_1 |y_t(i,j) - y_t(i,j+1)| \\
& + (n - n_r)^{-1} \sum_{i=0}^{n_r-1} \sum_{j=0}^{n_c-2} g_2 |cb_t(i,j) - cb_t(i,j+1)| \\
& + (n - n_r)^{-1} \sum_{i=0}^{n_r-1} \sum_{j=0}^{n_c-2} g_3 |cr_t(i,j) - cr_t(i,j+1)| \Big)
\end{aligned}
\tag{2}
\]
Other frame-based statistical measures for either
interlaced or progressive pictures are calculated as:

\[
\begin{aligned}
Z_{F,v} = (g_1 + g_2 + g_3)^{-1} \Big( & (m - m_c)^{-1} \sum_{j=0}^{m_c-1} \sum_{i=0}^{m_r-2} g_1 |y_t(i,j) - y_t(i+1,j)| \\
& + (n - n_c)^{-1} \sum_{j=0}^{n_c-1} \sum_{i=0}^{n_r-2} g_2 |cb_t(i,j) - cb_t(i+1,j)| \\
& + (n - n_c)^{-1} \sum_{j=0}^{n_c-1} \sum_{i=0}^{n_r-2} g_3 |cr_t(i,j) - cr_t(i+1,j)| \Big)
\end{aligned}
\tag{3}
\]

\[
\begin{aligned}
Z_{F,d45} = (g_1 + g_2 + g_3)^{-1} \Big( & (m_r - 1)^{-1} (m_c - 1)^{-1} \sum_{i=1}^{m_r-1} \sum_{j=0}^{m_c-2} g_1 |y_t(i,j) - y_t(i-1,j+1)| \\
& + (n_r - 1)^{-1} (n_c - 1)^{-1} \sum_{i=1}^{n_r-1} \sum_{j=0}^{n_c-2} g_2 |cb_t(i,j) - cb_t(i-1,j+1)| \\
& + (n_r - 1)^{-1} (n_c - 1)^{-1} \sum_{i=1}^{n_r-1} \sum_{j=0}^{n_c-2} g_3 |cr_t(i,j) - cr_t(i-1,j+1)| \Big)
\end{aligned}
\tag{4}
\]

\[
\begin{aligned}
Z_{F,d135} = (g_1 + g_2 + g_3)^{-1} \Big( & (m_r - 1)^{-1} (m_c - 1)^{-1} \sum_{i=0}^{m_r-2} \sum_{j=0}^{m_c-2} g_1 |y_t(i,j) - y_t(i+1,j+1)| \\
& + (n_r - 1)^{-1} (n_c - 1)^{-1} \sum_{i=0}^{n_r-2} \sum_{j=0}^{n_c-2} g_2 |cb_t(i,j) - cb_t(i+1,j+1)| \\
& + (n_r - 1)^{-1} (n_c - 1)^{-1} \sum_{i=0}^{n_r-2} \sum_{j=0}^{n_c-2} g_3 |cr_t(i,j) - cr_t(i+1,j+1)| \Big)
\end{aligned}
\tag{5}
\]
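
The four frame-based measures of equations (2) through (5) share one structure: a mean absolute difference between each pixel and its neighbor in a given direction, computed per component and then combined with the weights $g_1$, $g_2$, and $g_3$. The Python sketch below illustrates that structure under the assumption that each component is a list of pixel rows with at least two rows and two columns; the function names and the `DIRECTIONS` table are hypothetical.

```python
def mean_abs_diff(plane, di, dj):
    """Average |p(i,j) - p(i+di,j+dj)| over all pixel pairs inside the plane."""
    rows, cols = len(plane), len(plane[0])
    total, count = 0, 0
    for i in range(rows):
        for j in range(cols):
            ii, jj = i + di, j + dj
            if 0 <= ii < rows and 0 <= jj < cols:
                total += abs(plane[i][j] - plane[ii][jj])
                count += 1
    return total / count


# (row, column) offsets for the four directions used in equations (2)-(5)
DIRECTIONS = {"h": (0, 1), "v": (1, 0), "d45": (-1, 1), "d135": (1, 1)}


def frame_measures(y, cb, cr, weights=(2.0, 1.0, 1.0)):
    """Return the four frame-based measures as a dict keyed by direction."""
    g1, g2, g3 = weights
    result = {}
    for name, (di, dj) in DIRECTIONS.items():
        combined = (g1 * mean_abs_diff(y, di, dj)
                    + g2 * mean_abs_diff(cb, di, dj)
                    + g3 * mean_abs_diff(cr, di, dj))
        result[name] = combined / (g1 + g2 + g3)
    return result
```

Counting only the pixel pairs that lie inside the plane reproduces the normalizing factors of the equations, for example $m - m_r$ pairs in the horizontal direction and $(m_r - 1)(m_c - 1)$ pairs along either diagonal.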



Field-based statistical measures for interlaced
pictures, where frame encoding mode is considered, are
calculated as:

\[
Z_{f,v} = f_1 Z_{f,v}^{top} + f_2 Z_{f,v}^{bot} \tag{6}
\]

\[
Z_{f,d45} = f_1 Z_{f,d45}^{top} + f_2 Z_{f,d45}^{bot} \tag{7}
\]

\[
Z_{f,d135} = f_1 Z_{f,d135}^{top} + f_2 Z_{f,d135}^{bot} \tag{8}
\]

with $f_1 = 1.0$ and $f_2 = 1.0$, and top and bot representing
the top and bottom fields of an interlaced frame, respectively.
Each line of the top field of an interlaced frame is
spatially located above a line of the bottom field of the
same frame. Components of equations (6), (7) and (8) can be
computed as:



\[
\begin{aligned}
Z_{f,v}^{x} = (g_1 + g_2 + g_3)^{-1} \Big( & (m/2 - m_c)^{-1} \sum_{j=0}^{m_c-1} \sum_{i=0}^{m_r/2-2} g_1 |y_t(2i+o_x,j) - y_t(2(i+1)+o_x,j)| \\
& + (n/2 - n_c)^{-1} \sum_{j=0}^{n_c-1} \sum_{i=0}^{n_r/2-2} g_2 |cb_t(2i+o_x,j) - cb_t(2(i+1)+o_x,j)| \\
& + (n/2 - n_c)^{-1} \sum_{j=0}^{n_c-1} \sum_{i=0}^{n_r/2-2} g_3 |cr_t(2i+o_x,j) - cr_t(2(i+1)+o_x,j)| \Big)
\end{aligned}
\tag{9}
\]

\[
\begin{aligned}
Z_{f,d45}^{x} = (g_1 + g_2 + g_3)^{-1} \Big( & (m_r/2 - 1)^{-1} (m_c - 1)^{-1} \sum_{i=1}^{m_r/2-1} \sum_{j=0}^{m_c-2} g_1 |y_t(2i+o_x,j) - y_t(2(i-1)+o_x,j+1)| \\
& + (n_r/2 - 1)^{-1} (n_c - 1)^{-1} \sum_{i=1}^{n_r/2-1} \sum_{j=0}^{n_c-2} g_2 |cb_t(2i+o_x,j) - cb_t(2(i-1)+o_x,j+1)| \\
& + (n_r/2 - 1)^{-1} (n_c - 1)^{-1} \sum_{i=1}^{n_r/2-1} \sum_{j=0}^{n_c-2} g_3 |cr_t(2i+o_x,j) - cr_t(2(i-1)+o_x,j+1)| \Big)
\end{aligned}
\tag{10}
\]

\[
\begin{aligned}
Z_{f,d135}^{x} = (g_1 + g_2 + g_3)^{-1} \Big( & (m_r/2 - 1)^{-1} (m_c - 1)^{-1} \sum_{i=0}^{m_r/2-2} \sum_{j=0}^{m_c-2} g_1 |y_t(2i+o_x,j) - y_t(2(i+1)+o_x,j+1)| \\
& + (n_r/2 - 1)^{-1} (n_c - 1)^{-1} \sum_{i=0}^{n_r/2-2} \sum_{j=0}^{n_c-2} g_2 |cb_t(2i+o_x,j) - cb_t(2(i+1)+o_x,j+1)| \\
& + (n_r/2 - 1)^{-1} (n_c - 1)^{-1} \sum_{i=0}^{n_r/2-2} \sum_{j=0}^{n_c-2} g_3 |cr_t(2i+o_x,j) - cr_t(2(i+1)+o_x,j+1)| \Big)
\end{aligned}
\tag{11}
\]



where x represents the type of field, i.e., top or bot, and 



for x = top, 0 X = 0 t = 0 and for x = bot, 0 X = 0 bot = 1. 
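The field-based vertical measure can be illustrated with a minimal Python sketch. It assumes each component plane is a list of rows of sample values, that the per-plane normalization is the number of summed differences, and that the top and bottom field measures are combined by the weighted sum with f_1 and f_2 described above. The function names are hypothetical and are not taken from the specification.

```python
from typing import Sequence, Tuple

Plane = Sequence[Sequence[float]]  # rows of samples for one component (y, cb or cr)


def field_vertical_activity(plane: Plane, offset: int) -> float:
    """Mean absolute difference between vertically adjacent lines of one field.

    offset plays the role of o_x: 0 selects the top field (even frame lines),
    1 selects the bottom field (odd frame lines). The plane needs at least
    four frame lines so that each field has two lines to compare.
    """
    rows, cols = len(plane), len(plane[0])
    field_lines = rows // 2
    total = 0.0
    for i in range(field_lines - 1):              # i = 0 .. rows/2 - 2
        upper = plane[2 * i + offset]
        lower = plane[2 * (i + 1) + offset]
        for j in range(cols):                     # j = 0 .. cols - 1
            total += abs(upper[j] - lower[j])
    return total / ((field_lines - 1) * cols)


def field_vertical_measure(y: Plane, cb: Plane, cr: Plane, offset: int,
                           g: Tuple[float, float, float] = (2.0, 1.0, 1.0)) -> float:
    """Weighted combination of the three component activities for one field."""
    g1, g2, g3 = g
    weighted = (g1 * field_vertical_activity(y, offset)
                + g2 * field_vertical_activity(cb, offset)
                + g3 * field_vertical_activity(cr, offset))
    return weighted / (g1 + g2 + g3)


def combine_fields(z_top: float, z_bot: float, f1: float = 1.0, f2: float = 1.0) -> float:
    """Weighted summation of the top and bottom field indicators."""
    return f1 * z_top + f2 * z_bot
```

For example, for a four-line plane whose lines hold the constant values 0, 10, 4 and 20, the top field (lines 0 and 2) has an activity of 4 and the bottom field (lines 1 and 3) an activity of 10.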



For the case where the encoder is set in the field encoding mode, each picture is stored in the memory unit of FIG. 1 as a field. In this case, all inter-pixel statistical measures are computed using equations (2), (3), (4), and (5) with m_r and n_r taking on the field resolutions for luminance and chrominance components, respectively. Finally, an example for values of the weights g is: g_1 = 2.0, g_2 = 1.0, and g_3 = 1.0.



In accordance with one embodiment of the present invention, a set of picture-based statistical measures {Z_h, Z_{F,v}, Z_{f,v}, Z_{F,d45}, Z_{f,d45}, Z_{F,d135}, and Z_{f,d135}} are fed to the picture difficulty evaluator 124 of FIG. 2. Depending on the encoder's mode of operation or the nature of the input source, a sub-set of statistical indicators is computed and sent to the difficulty measure comparator. For example, if the user knows the source is progressive, only Z_h, Z_{F,v}, Z_{F,d45}, and Z_{F,d135} are calculated with the proper frame resolutions, and all switches 125 corresponding to these indicators are turned on in FIG. 2. If the source is interlaced and the user sets the encoder in field encoding mode, again the indicators Z_h, Z_{F,v}, Z_{F,d45}, and Z_{F,d135} are calculated, this time with field resolutions. For the aforementioned cases a final statistical measure Z_max is obtained by a difficulty measure comparator 126 such that:

$$
Z_{max} = \mathrm{MAX}\left( Z_h, Z_{F,v}, Z_{F,d45}, Z_{F,d135} \right) \tag{12}
$$

If the user sets the encoder for frame encoding mode of an interlaced source, all seven inputs to the picture difficulty evaluator of FIG. 2 are present and computed, i.e., Z_h, Z_{F,v}, Z_{f,v}, Z_{F,d45}, Z_{f,d45}, Z_{F,d135}, and Z_{f,d135}. This means that all switches 125 to picture difficulty evaluator 124 of FIG. 2 are now turned on. In this case, Z_max is obtained by select maximum number logic 127 as:

$$
Z_{max} = \mathrm{MAX}\left( Z_h, \min(Z_{F,v}, Z_{f,v}), \min(Z_{F,d45}, Z_{f,d45}), \min(Z_{F,d135}, Z_{f,d135}) \right) \tag{13}
$$

Parameter Z_max indicates how difficult a picture is and further, the most difficult pictures (example, large Z_max values) will likely consume the most bits.

Considering a broad class of video sequences, Z_max can potentially possess a wide range. In order to classify every video shot, a mapping technique is employed which forms a dependency between the picture-based statistical measure Z_max and a level of encoding difficulty. The number of levels is finite, and therefore every possible value of Z_max can be mapped into a level. The mapping function is defined by the N-level quantizer Q(Z_max) 128, which is incorporated into the picture difficulty evaluator 124 of FIG. 2. One embodiment of the mechanism of the N-level quantizer 128 is depicted in FIG. 3. Every value of Z_max is fed to the N-level quantizer and a parameter defined as D_f is provided (see FIG. 2) as output. As one example, a value of N = 13 is used for the Q(Z_max) quantizer of FIG. 3, but any number of levels greater than or equal to two could be derived for N in accordance with the present invention.

The quantizer of FIG. 3 will operate on Z_max and compute D_f through

$$
D_f =
\begin{cases}
L_1 & \text{if } Z_{max} < T_1 \\
L_N & \text{if } Z_{max} > T_2 \\
\mathrm{INT}\left( \dfrac{(L_N - L_1)\, Z_{max} + T_2 L_1 - T_1 L_N}{T_2 - T_1} + (1+a)^{-1} \right) & \text{otherwise}
\end{cases}
\tag{14}
$$

where INT(.) denotes the largest integer number which is smaller than the argument of the function. Threshold parameters of equation (14) are T_1 = 2 and T_2 = 11, and the limits on levels are L_1 = 1 and L_N = 13. Parameter (1 + a)^{-1} controls the positions of the centroids of step sizes of the quantizer function along the Z_max axis. A large value for this parameter will shift the centroids to the left and a smaller value will have an opposite impact. The quantizer of FIG. 3 is drawn with a = 1. This means that we are more biased toward declaring pictures as difficult. Finally, the dashed line of FIG. 3 is a representation of the argument of the INT(.) function with no adjusting parameter (1 + a)^{-1}.
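The two stages of the picture difficulty evaluator, selection of Z_max followed by N-level quantization, can be sketched as follows. This is a minimal illustration under stated assumptions: the argument of INT(.) is taken to be the straight line through (T_1, L_1) and (T_2, L_N) shifted by (1 + a)^{-1}, INT(.) is implemented with `math.floor`, and the result is clamped to L_N so that other values of a cannot leave the level range. The function names and argument layout are hypothetical.

```python
import math
from typing import Optional, Tuple

Triple = Tuple[float, float, float]  # (vertical, d45, d135) indicators


def select_z_max(z_h: float, frame: Triple, field: Optional[Triple] = None) -> float:
    """Pick the final statistical measure Z_max.

    With only the horizontal and frame-resolution indicators (progressive source
    or field encoding mode) the maximum of the four values is used. When
    field-based indicators are also supplied (frame encoding of an interlaced
    source) each direction first contributes the minimum of its frame-based and
    field-based value.
    """
    if field is None:
        return max(z_h, *frame)
    per_direction = [min(f_big, f_small) for f_big, f_small in zip(frame, field)]
    return max(z_h, *per_direction)


def quantize_difficulty(z_max: float, t1: float = 2.0, t2: float = 11.0,
                        l1: int = 1, ln: int = 13, a: float = 1.0) -> int:
    """Map Z_max to a difficulty level D_f between l1 and ln."""
    if z_max < t1:
        return l1
    if z_max > t2:
        return ln
    # Assumed linear mapping between the thresholds, then the adjusting parameter.
    line = ((ln - l1) * z_max + t2 * l1 - t1 * ln) / (t2 - t1)
    return min(ln, math.floor(line + 1.0 / (1.0 + a)))
```

With the example thresholds, a picture with Z_max = 6.5 lands on the line value 7.0 and, with a = 1, is assigned level 7; values below T_1 or above T_2 saturate at levels 1 and 13.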



1.3 Frequency Classifier

FIG. 4 displays one possible way of partitioning the frequency coefficients of an 8 x 8 DCT block 400 of a picture into different pattern classes 410. Since the coefficients are oriented such that their significance decreases from left to right and top to bottom, the partitioning strategy should favor the most significant values located near the top left of the 8 x 8 block 400, and other classes are formed by expanding into the next set of coefficients. The approach of FIG. 4 uses 13 pattern classes 420 and each class takes the shape of a right-angle triangle. Other numbers of pattern classes, or other formations such as squares, rectangles or any other shape, could be used in accordance with the present invention.

Difficulty measure D_f is sent to the frequency classifier 440 of FIG. 4 and matched against a look-up table 450. The look-up table has a number of frequency pattern classes 420 in store. A pattern S_i is selected 460 by the frequency classifier 440 such that D_f = L_i. Pattern classes are indexed so that the lowest order class is associated with the least difficult picture (example, D_f = L_1), and will carry more DCT coefficients throughout the encoding procedure than the most difficult picture (example, D_f = L_N) corresponding to the highest order class.

1.4 Frequency Constrainer

The selected pattern S_i along with the DCT coefficients are sent to the frequency constrainer 123 of FIG. 5 after a scene change is detected (switch 122 of FIG. 1 is in "b" position). If the coefficients belong to the set S_i, they will be kept 510; otherwise they will be discarded 520. Therefore, a constrained set of DCT coefficients is passed through the frequency constrainer 123 and fed to the BQ unit 150 (FIG. 1) for quantization. Since a difficult picture can yield a large number of bits in compressed form, a chosen pattern such as S_12 or S_13 leads to aggressive frequency constraining, which in turn contributes to providing a pseudo-constant bits per picture video bit-stream. If the encoder does not use any constraining mechanism, difficult pictures would cause the decoder buffer to underflow.
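The keep-or-discard step of the frequency constrainer can be sketched as below. This is only an illustration: the triangular pattern is parameterized by a hypothetical leg length rather than by the actual look-up table contents of FIG. 4, which are not reproduced here, and "discarded" is assumed to mean that the coefficient is set to zero.

```python
from typing import List, Set, Tuple

Position = Tuple[int, int]  # (row, column) of a coefficient inside the block


def triangle_pattern(leg: int, size: int = 8) -> Set[Position]:
    """Right-angle triangle anchored at the top-left (most significant) coefficient.

    A position (u, v) belongs to the pattern when u + v < leg, so a larger leg
    keeps more coefficients and every smaller triangle is a subset of a larger one.
    """
    return {(u, v) for u in range(size) for v in range(size) if u + v < leg}


def constrain_block(block: List[List[int]], pattern: Set[Position]) -> List[List[int]]:
    """Keep coefficients inside the pattern; zero out the rest (assumed meaning of 'discard')."""
    return [
        [coef if (u, v) in pattern else 0 for v, coef in enumerate(row)]
        for u, row in enumerate(block)
    ]
```

A pattern with a leg of 3 retains six coefficients per 8 x 8 block, so a more difficult picture would be assigned a shorter leg under this parameterization.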






1.5 Zero-Bytes Generator

For some shot-changes, where the new scene is composed of very easy material such as black or grey pictures, the use of the most conservative frequency pattern classes may not result in compressed picture sizes which are close to the nominal value of the user-defined average picture bits R_a of the bit-stream. For these scenarios, a zero-byte generation mechanism is adopted to prevent the decoder buffer from overflowing. After the final picture count, the actual value of picture bits R_r is supplemented with a number of zero bytes equivalent to:

$$
R_z = (R_a - g_d) - R_r \tag{15}
$$

The nominal values of R_a and R_r are fed to the zero-byte generator 130 of the frequency domain data management model of FIG. 1, and R_z zero bytes are computed according to equation (15) and sent to the VLC unit 160. Zero bytes are stuffed at the end of the picture in the compressed bit-stream. Constant value g_d is a user-defined number to control the number of zero bytes. A user with a large tolerance for picture bits fluctuations in low delay mode of operation may wish to use a larger g_d. For applications where fluctuations are not tolerated, g_d should be zero. A typical value for most applications is g_d = 64 bits.
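The stuffing amount of equation (15) can be sketched as follows. The sketch assumes that R_a, R_r and g_d are all expressed in bits, that no stuffing is applied when the picture already reaches R_a - g_d, and that the bit count is converted to whole bytes by integer division; the specification does not state the rounding, and the function names are hypothetical.

```python
def zero_stuffing_bits(r_a: int, r_r: int, g_d: int = 64) -> int:
    """Number of stuffing bits R_z needed to lift a picture to R_a - g_d bits."""
    target = r_a - g_d
    if r_r < target:
        return target - r_r
    return 0  # picture is already large enough; nothing is stuffed


def zero_stuffing_bytes(r_a: int, r_r: int, g_d: int = 64) -> int:
    """Whole zero bytes to append (assumed rounding: truncate to full bytes)."""
    return zero_stuffing_bits(r_a, r_r, g_d) // 8
```

For instance, with R_a = 40000 bits, R_r = 30000 bits and g_d = 64 bits, the sketch yields 9936 stuffing bits, i.e. 1242 zero bytes.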

2-3. Intra Updater

The intra updater 110 (FIG. 1) of the present invention adopts a unique approach to block coding in intra mode which is different from prior attempts at updating regions of pictures. Most of the art related to low delay encoding employs a systematic way of sweeping through the pictures of the input video. In that approach, it is guaranteed that the whole picture is updated by unidirectionally moving large blocks of intra-coded pixels. This approach, although simple and easy to implement, introduces a disturbing effect in the quality of the video stream. The intra-coded image blocks are viewed as if they are raised out of the surface of the video screen. This discontinuity phenomenon, caused by not uniformly distributing block artifacts, is easily witnessed by a viewer.

The approach of the present invention to intra updating uses a mechanism to disseminate blocks of intra-coded pixels throughout the picture and thereby provides a more feasible approach to uniform distribution of compression artifacts. Further, the orientation of scattered intra-coded blocks is changed herein for every picture to minimize the impact of the encoding distortions. This results in alternating between blocks oriented in the northwest-southeast direction and blocks oriented in the northeast-southwest direction which move in opposite directions. Each orientation is composed of two classes of decimated diagonal intra-coded blocks which are equally spaced along a path perpendicular to their orientation and cover the surface of the picture.

FIG. 6 shows one embodiment of an intra updater where, after two pictures, the whole picture is updated using a block-based intra-coding approach. The block processor 105 of FIG. 1 provides updating parameter U_f = 2, picture number K, row block index B_I, and column block index B_J to intra updater 110 of FIG. 6. If K represents the first picture 600, a cycle counter 610 defined as c is set at zero to denote the beginning of a cycle. A binary representation of B_I is AND gated 620 with 1 and the output is defined as b_1. If B_I indicates 630 an even row of the picture (example, output of AND gate is b_1 = 0), then a block address G is computed 640 as:

$$
G = B_I + B_J + c \tag{16}
$$

otherwise, for an odd row of the picture (example, b_1 ≠ 0), G is computed 650 as:

$$
G = B_J - B_I - c \tag{17}
$$

Binary representations of G and U_f - 1 665 are AND gated 660 and output b_2 is compared 670 with zero. If the result is zero, i.e., G is a multiple of U_f, then the block under process is declared an intra block 680 and the information is sent to the intra/inter decider 111 and MEU units 109 of FIG. 1. Otherwise, the intra/inter decider 111 will determine the modality of the encoded block. For cases where K is not the first picture, it will be examined to see if the picture is within the cycle of updating or not. This is done by testing the equality c + 1 = U_f 690. If the equality condition holds true, one whole cycle is processed and counter c is reset to zero 610, denoting that a new cycle is about to begin. Otherwise counter c is incremented by one 700. For both cases c is fed to the block that computes the address G.
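The decision flow of FIG. 6 can be sketched in a few lines of Python. The sketch assumes that U_f is a power of two, which is what makes the AND with U_f - 1 equivalent to testing whether G is a multiple of U_f, and it relies on Python's two's-complement behaviour of `&` for negative addresses. The function names are hypothetical.

```python
def block_address(b_i: int, b_j: int, c: int) -> int:
    """Block address G for row index b_i, column index b_j and cycle counter c."""
    if (b_i & 1) == 0:            # even row: equation (16)
        return b_i + b_j + c
    return b_j - b_i - c          # odd row: equation (17)


def is_intra_block(b_i: int, b_j: int, c: int, u_f: int = 2) -> bool:
    """True when G AND (U_f - 1) is zero, i.e. the block is forced to intra mode."""
    return (block_address(b_i, b_j, c) & (u_f - 1)) == 0


def next_cycle_counter(c: int, u_f: int = 2) -> int:
    """Advance the cycle counter: reset when c + 1 equals U_f, else increment."""
    return 0 if c + 1 == u_f else c + 1
```

Stepping the counter through one full cycle shows that, under these assumptions, every block position is declared intra exactly once in every U_f pictures, so the whole picture is refreshed within one cycle.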

The present invention can be included, for example, in an article of manufacture (e.g., one or more computer program products) having, for instance, computer usable media. This media has embodied therein, for instance, computer readable program code means for providing and facilitating the capabilities of the present invention. The articles of manufacture can be included as part of the computer system or sold separately.

Additionally, at least one program storage device readable by machine, tangibly embodying at least one program of instructions executable by the machine, to perform the capabilities of the present invention, can be provided.

The flow diagrams depicted herein are provided by way of example. There may be variations to these diagrams or the steps (or operations) described herein without departing from the spirit of the invention. For instance, in certain cases, the steps may be performed in differing order, or steps may be added, deleted or modified. All of these variations are considered to comprise part of the present invention as recited in the appended claims.

While the invention has been described in detail herein in accordance with certain preferred embodiments thereof, many modifications and changes therein may be effected by those skilled in the art. Accordingly, it is intended by the appended claims to cover all such modifications and changes as fall within the true spirit and scope of the invention.






Claims 



1. A method for processing a sequence of video frames, said method comprising:

    dynamically encoding said sequence of video frames to produce a pseudo-constant bits per frame compressed signal at a scene change within said sequence of video frames, said dynamically encoding comprising:

        detecting when a new scene occurs in the sequence of video frames; and

        responsive to said detecting, dynamically determining a group of frequency domain pixel data to be retained for a frame of the new scene.

2. The method of claim 1, wherein said dynamically determining comprises determining a level of frequency domain pixel data to be retained from multiple predefined levels, and wherein said determining determines the level of frequency domain pixel data to be retained for an initial frame of the new scene.

3. The method of claim 2, wherein said level of frequency data to be retained is associated with a frequency constraining pattern, and said determining comprises selecting a frequency constraining pattern to be employed from a plurality of frequency constraining patterns associated with said multiple predefined levels.

4. The method of claim 3, wherein said plurality of frequency patterns comprise a common geometrical shape, and wherein said common geometrical shape of said plurality of frequency patterns can be one of a plurality of common geometrical shapes.

5. The method of claim 3, wherein at least one most significant frequency pixel is included by each of the plurality of frequency constraining patterns.

6. The method of claim 1, wherein said dynamically determining comprises determining said group of frequency domain pixel data to be retained for said frame of the new scene by evaluating picture difficulty of the new scene.

7. The method of claim 6, wherein said dynamically determining further comprises ascertaining picture difficulty indicators representative of picture difficulty of the new scene; wherein said picture difficulty indicators are ascertained by computing pixel-to-pixel differences in at least some of horizontal, vertical, and diagonal directions.

8. The method of claim 7, wherein said picture difficulty indicators are ascertained by computing pixel-to-pixel differences in each of said horizontal, vertical, and diagonal directions.

9. The method of claim 7, wherein said ascertaining comprises determining a maximum indicator for a frame picture of a progressive video source or a field picture of an interlaced video source, said maximum indicator being determined by comparing said picture difficulty indicators to each other.

10. The method of claim 7, wherein said ascertaining comprises ascertaining picture difficulty indicators in vertical and diagonal directions for both top and bottom fields of a frame of an interlaced video source, and picture difficulty indicators in vertical and diagonal directions for the frame of the interlaced video source, and wherein said ascertaining further comprises ascertaining field-based indicators in vertical and diagonal directions by computing a weighted summation of individual top and bottom field indicators having a same direction, and wherein for each vertical and diagonal direction, a picture difficulty indicator is determined by ascertaining a minimum number between a corresponding field-based indicator and a frame-based indicator derived from the same frame of the interlaced video source.

11. The method of claim 10, wherein said ascertaining comprises selecting said picture difficulty indicators by determining a maximum indicator of the ascertained vertical and diagonal indicators, as well as a horizontal indicator.

12. The method of claim 9, wherein said ascertaining further comprises mapping the maximum indicator to a level of an n-level quantizer if the value of the maximum indicator is between predefined thresholds, and mapping the maximum indicator to a constant number if the indicator is outside of said predefined thresholds.

13. The method of claim 12, further comprising employing said mapping to identify an address of a frequency pattern in a look-up table, said look-up table containing a plurality of frequency patterns, and wherein said determining comprises selecting one frequency pattern of said plurality of frequency patterns.

14. The method of claim 13, wherein when the maximum indicator has a large nominal value it is re-mapped into a frequency pattern comprising a lesser number of frequency coefficients than a number of coefficients of a frequency pattern corresponding to when the maximum indicator has a smaller nominal value.

15. The method of claim 14, wherein said plurality of frequency patterns are indexed such that a population of one frequency pattern is a subset of a population of a frequency pattern with a lower index number.

16. The method of claim 14, wherein said determining comprises comparing each frequency coefficient of a block with respect to said selected frequency pattern, and if the coefficient belongs to the frequency pattern, the coefficient is retained as part of said group of frequency domain pixel data.

17. The method of claim 1, wherein after a final frame count, if the actual frame bits is smaller than the difference of a predefined number and a guard band value, the difference is computed and a number of zero bytes according to this difference is added to the final picture count to ensure said pseudo-constant bits per frame compressed signal.

18. The method of claim 1, wherein said method is implemented within an MPEG encoder.

19. The method of claim 1, wherein said dynamically encoding further comprises encoding said frame of the new scene as an intra-coded frame.

20. A method for processing a sequence of video frames, said method comprising:

    dynamically encoding said sequence of video frames, said dynamically encoding comprising:

        encoding multiple blocks of a first frame of the sequence of video frames in intra-coded mode using a first orientation for said intra-coded blocks; and

        encoding multiple blocks of a second frame of the sequence of video frames in intra-coded mode using a second orientation for said intra-coded blocks, wherein said first orientation and said second orientation are perpendicular.

21. The method of claim 20, wherein said first orientation comprises a first diagonal orientation, and said second orientation comprises a second diagonal orientation.

22. The method of claim 21, wherein said first frame and said second frame comprise adjacent frames in said sequence of video frames.

23. The method of claim 21, wherein said multiple intra-coded blocks of said first frame are scattered throughout said first frame, and wherein said multiple intra-coded blocks of said second frame are scattered throughout said second frame.

24. The method of claim 21, wherein said multiple intra-coded blocks of said first frame are equally spaced along a direction perpendicular to said first orientation of said multiple intra-coded blocks, and wherein said multiple intra-coded blocks of said second frame are equally spaced along a direction perpendicular to said second orientation of the multiple intra-coded blocks.

25. The method of claim 21, wherein said encoding said multiple blocks of said first frame in intra-coded mode using said first orientation comprises imposing different sub-sampling rates on said multiple intra-coded blocks along said first orientation, and wherein said encoding multiple blocks of said second frame in intra-coded mode using said second orientation comprises imposing different sub-sampling rates on said multiple intra-coded blocks along said second orientation.

26. The method of claim 21, wherein said sequence of video frames comprises a plurality of even numbered frames and a plurality of odd numbered frames, and wherein said first frame comprises one frame of said plurality of even numbered frames and said second frame comprises one frame of said plurality of odd numbered frames, and wherein intra-coded blocks of said plurality of even numbered frames move along said first orientation in a direction opposite from that of intra-coded blocks moving along said second orientation within said plurality of odd numbered frames.

27. A system for processing a sequence of video frames, said system comprising:

    an encoder for dynamically encoding said sequence of video frames to produce a pseudo-constant bits per frame compressed signal at a scene change within said sequence of video frames, said encoder comprising:

        means for detecting when a new scene occurs in the sequence of video frames; and

        means for dynamically determining a group of frequency domain pixel data to be retained for a frame of the new scene responsive to said detecting of the new scene.

28. The system of claim 27, wherein said means for dynamically determining comprises means for determining a level of frequency domain pixel data to be retained from multiple predefined levels, and wherein said means for determining determines the level of frequency domain pixel data to be retained for an initial frame of the new scene.

29. The system of claim 28, wherein said level of frequency data to be retained is associated with a frequency constraining pattern, and said means for determining comprises means for selecting a frequency constraining pattern to be employed from a plurality of frequency constraining patterns associated with said multiple predefined levels.

30. The system of claim 29, wherein said plurality of frequency patterns comprise a common geometrical shape, and wherein said common geometrical shape of said plurality of frequency patterns can be one of a plurality of common geometrical shapes.

31. The system of claim 29, wherein at least one most significant frequency pixel is included by each of the plurality of frequency constraining patterns.

32. The system of claim 27, wherein said means for dynamically determining comprises means for determining said group of frequency domain pixel data to be retained for said frame of the new scene by evaluating picture difficulty of the new scene.

33. The system of claim 32, wherein said means for dynamically determining further comprises means for ascertaining picture difficulty indicators representative of picture difficulty of the new scene, wherein said picture difficulty indicators are ascertained by computing pixel-to-pixel differences in at least some of horizontal, vertical, and diagonal directions.

34. The system of claim 33, wherein said picture difficulty indicators are ascertained by computing pixel-to-pixel differences in each of said horizontal, vertical, and diagonal directions.

35. The system of claim 33, wherein said means for ascertaining comprises means for determining a maximum indicator for a frame picture of a progressive video source or a field picture of an interlaced video source, said maximum indicator being determined by comparing said picture difficulty indicators to each other.



36. The system of claim 33, wherein said means for ascertaining comprises means for ascertaining picture difficulty indicators in vertical and diagonal directions for both top and bottom fields of a frame of an interlaced video source, and picture difficulty indicators in vertical and diagonal directions for the frame of the interlaced video source, and wherein said means for ascertaining further comprises means for ascertaining field-based indicators in vertical and diagonal directions by computing a weighted summation of individual top and bottom field indicators having a same direction, and wherein for each vertical and diagonal direction, a picture difficulty indicator is determined by ascertaining a minimum number between a corresponding field-based indicator and a frame-based indicator derived from the same frame of the interlaced video source.

37. The system of claim 36, wherein said means for ascertaining comprises means for selecting said picture difficulty indicators by determining a maximum indicator of the ascertained vertical and diagonal indicators, as well as a horizontal indicator.

38. The system of claim 35, wherein said means for ascertaining further comprises means for mapping the maximum indicator to a level of an n-level quantizer if the value of the maximum indicator is between predefined thresholds, and for mapping the maximum indicator to a constant number if the indicator is outside of said predefined thresholds.

39. The system of claim 38, further comprising means for employing said mapping to identify an address of a frequency pattern in a look-up table, said look-up table containing a plurality of frequency patterns, and wherein said means for determining comprises means for selecting one frequency pattern of said plurality of frequency patterns.

40. The system of claim 39, wherein when the maximum indicator has a large nominal value it is re-mapped into a frequency pattern comprising a lesser number of frequency coefficients than a number of coefficients of a frequency pattern corresponding to when the maximum indicator has a smaller nominal value.

41. The system of claim 40, wherein said plurality of frequency patterns are indexed such that a population of one frequency pattern is a subset of a population of a frequency pattern with a lower index number.

42. The system of claim 40, wherein said means for determining comprises means for comparing each frequency coefficient of a block with respect to said selected frequency pattern, and if the coefficient belongs to the frequency pattern, the coefficient is retained as part of said group of frequency domain pixel data.

43. The system of claim 27, wherein after a final frame count, if the actual frame bits is smaller than a difference of a predefined number and a guard band value, said system further comprises means for computing the difference and adding a number of zero bytes according to this difference to the final picture count to ensure said pseudo-constant bits per frame compressed signal.

44. The system of claim 27, wherein said encoder comprises an MPEG encoder.

45. The system of claim 27, wherein said means for dynamically encoding further comprises means for encoding said frame of the new scene as an intra-coded frame.

46. A system for processing a sequence of video frames, said system comprising:

    an encoder for dynamically encoding said sequence of video frames, said encoder comprising:

        means for encoding multiple blocks of a first frame of the sequence of video frames in an intra-coded mode using a first orientation for said intra-coded blocks; and

        means for encoding multiple blocks of a second frame of the sequence of video frames in intra-coded mode using a second orientation for said intra-coded blocks, wherein said first orientation and said second orientation are perpendicular.

47. The system of claim 46, wherein said first orientation comprises a first diagonal orientation, and said second orientation comprises a second diagonal orientation.

48. The system of claim 47, wherein said first frame and said second frame comprise adjacent frames in said sequence of video frames.

49. The system of claim 47, wherein said multiple intra-coded blocks of said first frame are scattered throughout said first frame, and wherein said multiple intra-coded blocks of said second frame are scattered throughout said second frame.

50. The system of claim 47, wherein said multiple intra-coded blocks of said first frame are equally spaced along a direction perpendicular to said first orientation of said multiple intra-coded blocks, and wherein said multiple intra-coded blocks of said second frame are equally spaced along a direction perpendicular to said second orientation of the multiple intra-coded blocks.

51. The system of claim 47, wherein said means for encoding said multiple blocks of said first frame in intra-coded mode using said first orientation comprises means for imposing different sub-sampling rates on said multiple intra-coded blocks along said first orientation, and wherein said means for encoding multiple blocks of said second frame in intra-coded mode using said second orientation comprises means for imposing different sub-sampling rates on said multiple intra-coded blocks along said second orientation.

52. The system of claim 47, wherein said sequence of video frames comprises a plurality of even numbered frames and a plurality of odd numbered frames, and wherein said first frame comprises one frame of said plurality of even numbered frames and said second frame comprises one frame of said plurality of odd numbered frames, wherein intra-coded blocks of said even numbered frames move along said first orientation in a direction opposite from that of intra-coded blocks moving along said second orientation within said plurality of odd numbered frames.

53. A system for processing a sequence of video frames, said system comprising:

    an encoder for dynamically encoding said sequence of video frames to produce a pseudo-constant bits per frame compressed signal at a scene change within said sequence of video frames, said encoder comprising a frequency domain data management unit, said frequency domain data management unit comprising:

        a scene-change detector for detecting when a new scene occurs in the sequence of video frames;

        a picture difficulty evaluator for evaluating picture difficulty of the new scene;

        a frequency classifier and constrainer for dynamically determining a group of frequency domain pixel data to be retained for a frame of the new scene responsive to said detecting of the new scene and complexity of the picture as determined by said picture difficulty evaluator.




1 54. A system for processing a sequence of video 

2 frames, said system comprising: 

3 an encoder for dynamically encoding said sequence 

4 of video frames, said encoder comprising: 

5 an intra-updater unit for assigning which 

6 blocks of a plurality of blocks of a frame are to 

7 be intra-coded, wherein said intra-updater unit 

8 facilitates: 

9 encoding multiple blocks of a first 

10 frame of the sequence of video frames in an 

11 intra-coded mode using a first orientation 

12 for said intra-coded blocks; and 

13 encoding multiple blocks of a second 

14 frame of the sequence of video frames in 

15 intra-coded mode using a second orientation 

16 for said intra-coded blocks, wherein said 

17 first orientation and said second orientation 

18 are perpendicular. 
1 55. At least one program storage device readable by a 

2 machine, tangibly embodying at least one program of 

3 instructions executable by the machine to perform a method 

4 for processing a sequence of video frames, said method 

5 comprising: 

6 dynamically encoding said sequence of video frames 

7 to produce a pseudo-constant bits per frame compressed 

8 signal at a scene change within said sequence of video 

9 frames, said dynamically encoding comprising: 

10 detecting when a new scene occurs in the 

11 sequence of video frames; and 

12 responsive to said detecting, dynamically 

13 determining a group of frequency domain pixel data 

14 to be retained for a frame of the new scene. 

1 56. The at least one program storage device of claim 

2 55, wherein said dynamically determining comprises 

3 determining a level of frequency domain pixel data to be 

4 retained from multiple predefined levels, and wherein said 

5 determining determines the level of frequency domain pixel 

6 data to be retained for an initial frame of the new scene. 
1 57. The at least one program storage device of claim 

2 56, wherein said level of frequency data to be retained is 

3 associated with a frequency constraining pattern, and said 

4 determining comprises selecting a frequency constraining 

5 pattern to be employed from a plurality of frequency 

6 constraining patterns associated with said multiple 

7 predefined levels. 

1 58. The at least one program storage device of claim 

2 57, wherein said plurality of frequency patterns comprise a 

3 common geometrical shape, and wherein said common 

4 geometrical shape of said plurality of frequency patterns 

5 can be one of a plurality of common geometrical shapes. 

1 59. The at least one program storage device of claim 

2 57, wherein at least one most significant frequency pixel is 

3 included by each of the plurality of frequency constraining 

4 patterns. 

1 60. The at least one program storage device of claim 

2 55, wherein said dynamically determining comprises 

3 determining said group of frequency domain pixel data to be 

4 retained for said frame of the new scene by evaluating 

5 picture difficulty of the new scene. 
1 61. The at least one program storage device of claim 

2 60, wherein said dynamically determining further comprises 

3 ascertaining picture difficulty indicators representative of 

4 picture difficulty of the new scene, wherein said picture 

5 difficulty indicators are ascertained by computing pixel-to- 

6 pixel differences in at least some of horizontal, vertical, 

7 and diagonal directions. 

1 62. The at least one program storage device of claim 

2 61, wherein said picture difficulty indicators are 

3 ascertained by computing pixel-to-pixel differences in each 

4 of said horizontal, vertical, and diagonal directions. 

1 63. The at least one program storage device of claim 

2 61, wherein said ascertaining comprises determining a 

3 maximum indicator for a frame picture of a progressive video 

4 source or a field picture of an interlaced video source, 

5 said maximum indicator being determined by comparing said 

6 picture difficulty indicators to each other. 
1 64. The at least one program storage device of claim 

2 61, wherein said ascertaining comprises ascertaining picture 

3 difficulty indicators in vertical and diagonal directions 

4 for both top and bottom fields of a frame of an interlaced 

5 video source, and picture difficulty indicators in vertical 

6 and diagonal directions for the frame of the interlaced 

7 video source, and wherein said ascertaining further 

8 comprises ascertaining field-based indicators in vertical 

9 and diagonal directions by computing a weighted summation of 

10 individual top and bottom field indicators having a same 

11 direction, and wherein for each vertical and diagonal 

12 direction, a picture difficulty indicator is determined by 

13 ascertaining a minimum number between a corresponding field- 

14 based indicator and a frame-based indicator derived from the 

15 same frame of the interlaced video source. 

1 65. The at least one program storage device of claim 

2 64, wherein said ascertaining comprises selecting said 

3 picture difficulty indicators by determining a maximum 

4 indicator of the ascertained vertical and diagonal 

5 indicators, as well as a horizontal indicator. 

1 66. The at least one program storage device of claim 

2 63, wherein said ascertaining further comprises mapping the 

3 maximum indicator to a level of an n-level quantizer if the 

4 value of the maximum indicator is between predefined 

5 thresholds, and mapping the maximum indicator to a constant 

6 number if the indicator is outside of said predefined 

7 thresholds. 
1 67. The at least one program storage device of claim 

2 63, further comprising employing said mapping to identify an 

3 address of a frequency pattern in a look-up table, said 

4 look-up table containing a plurality of frequency patterns, 

5 and wherein said determining comprises selecting one 

6 frequency pattern of said plurality of frequency patterns. 

1 68. The at least one program storage device of claim 

2 67, wherein when the maximum indicator has a large nominal 

3 value it is re-mapped into a frequency pattern comprising a 

4 lesser number of frequency coefficients than a number of 

5 coefficients of a frequency pattern corresponding to when 

6 the maximum indicator has a smaller nominal value. 

1 69. The at least one program storage device of claim 

2 68, wherein said plurality of frequency patterns are indexed 

3 such that a population of one frequency pattern is a subset 

4 of a population of a frequency pattern with a lower index 

5 number. 

1 70. The at least one program storage device of claim 

2 68, wherein said determining comprises comparing each 

3 frequency coefficient of a block with respect to said 

4 selected frequency pattern, and if the coefficient belongs 

5 to the frequency pattern, the coefficient is retained as 

6 part of said group of frequency domain pixel data. 
1 71. The at least one program storage device of claim 

2 55, wherein after a final frame count, if the actual frame 

3 bits is smaller than the difference of a predefined number 

4 and a guard band value, the difference is computed and a 

5 number of zero bytes according to this difference is added 

6 to the final picture count to ensure said pseudo-constant 

7 bits per frame compressed signal. 

1 72. The at least one program storage device of claim 

2 55, wherein said method is implemented within an MPEG 

3 encoder. 

1 73. The at least one program storage device of claim 

2 55, wherein said dynamically encoding further comprises 

3 encoding said frame of the new scene as an intra-coded frame. 
1 74. At least one program storage device readable by a 

2 machine, tangibly embodying at least one program of 

3 instructions executable by the machine to perform a method 

4 of processing a sequence of video frames, said method 

5 comprising: 

6 dynamically encoding said sequence of video 

7 frames, said dynamically encoding comprising: 

8 encoding multiple blocks of a first frame of 

9 the sequence of video frames in intra-coded mode 

10 using a first orientation for said intra-coded 

11 blocks; and 

12 encoding multiple blocks of a second frame of 

13 the sequence of video frames in intra-coded mode 

14 using a second orientation for said intra-coded 

15 blocks, wherein said first orientation and said 

16 second orientation are perpendicular. 

1 75. The at least one program storage device of claim 

2 74, wherein said first orientation comprises a first 

3 diagonal orientation, and said second orientation comprises 

4 a second diagonal orientation. 

1 76. The at least one program storage device of claim 

2 75, wherein said first frame and said second frame comprise 

3 adjacent frames in said sequence of video frames. 
1 77. The at least one program storage device of claim 

2 75, wherein said multiple intra-coded blocks of said first 

3 frame are scattered throughout said first frame, and wherein 

4 said multiple intra-coded blocks of said second frame are 

5 scattered throughout said second frame. 

1 78. The at least one program storage device of claim 

2 75, wherein said multiple intra-coded blocks of said first 

3 frame are equally spaced along a direction perpendicular to 

4 said first orientation of said multiple intra-coded blocks, 

5 and wherein said multiple intra-coded blocks of said second 

6 frame are equally spaced along a direction perpendicular to 

7 said second orientation of the multiple intra-coded blocks. 

1 79. The at least one program storage device of claim 

2 75, wherein said encoding said multiple blocks of said first 

3 frame in intra-coded mode using said first orientation 

4 comprises imposing different sub-sampling rates on said 

5 multiple intra-coded blocks along said first orientation, 

6 and wherein said encoding multiple blocks of said second 

7 frame in intra-coded mode using said second orientation 

8 comprises imposing different sub-sampling rates on said 

9 multiple intra-coded blocks along said second orientation. 
1 80. The at least one program storage device of claim 

2 75, wherein said sequence of video frames comprises a 

3 plurality of even numbered frames and a plurality of odd 

4 numbered frames, and wherein said first frame comprises one 

5 frame of said plurality of even numbered frames and said 

6 second frame comprises one frame of said plurality of odd 

7 numbered frames, and wherein intra-coded blocks of said 

8 plurality of even numbered frames move along said first 

9 orientation in a direction opposite from that of intra-coded 

10 blocks moving along said second orientation within said 

11 plurality of odd numbered frames. 



* * * * * 
METHOD AND APPARATUS FOR PRODUCING 
PSEUDO-CONSTANT BITS PER PICTURE VIDEO 
BIT-STREAMS FOR LOW-DELAY COMPRESSION SYSTEM 



Abstract of the Disclosure 

A frequency domain data management technique for 
producing pseudo-constant bits per picture compressed video 
bit-streams in a low delay digital encoding environment is 
presented. This technique forms a hierarchy among the 
localized samples of the picture in terms of frequency 
importance and the picture difficulty after a shot-change is 
detected. After a shot change, the data management technique 
implements a series of tasks composed of picture difficulty 
evaluation, frequency classification, frequency 
constraining, and zero bytes generation to achieve a 
pre-determined average picture bits. Further, the low delay 
encoder uses a unique updating mechanism to encode certain 
regions of the pictures in intra mode and ensures that the 
whole picture is updated after a pre-selected number of 
pictures. The updating method disseminates compression 
artifacts throughout the video stream by changing the 
orientation of the intra-coded regions for every picture and 
scatters intra-picture compression artifacts by spatially 
decimating the aforementioned regions at different rates. 
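
The series of tasks named in the abstract (picture difficulty evaluation, frequency classification, frequency constraining, and zero bytes generation) can be illustrated with a minimal sketch. The sketch is not taken from the disclosure: the function names, the mean-absolute-difference measure, the uniform quantizer, the triangular frequency patterns, and the reading of the guard band rule are all assumptions chosen for illustration only.

```python
from typing import List

Picture = List[List[int]]   # luminance samples, at least 2x2 (assumption)
Block = List[List[int]]     # square block of frequency-domain coefficients


def picture_difficulty(picture: Picture) -> float:
    """Largest mean absolute pixel-to-pixel difference over the
    horizontal, vertical and diagonal directions (assumed measure)."""
    rows, cols = len(picture), len(picture[0])
    horiz = sum(abs(picture[r][c] - picture[r][c + 1])
                for r in range(rows) for c in range(cols - 1)) / (rows * (cols - 1))
    vert = sum(abs(picture[r][c] - picture[r + 1][c])
               for r in range(rows - 1) for c in range(cols)) / ((rows - 1) * cols)
    diag = sum(abs(picture[r][c] - picture[r + 1][c + 1])
               for r in range(rows - 1) for c in range(cols - 1)) / ((rows - 1) * (cols - 1))
    return max(horiz, vert, diag)


def difficulty_to_level(indicator: float, low: float, high: float, n_levels: int) -> int:
    """Map the indicator to one of n_levels; values outside the
    thresholds map to a constant level (hypothetical uniform quantizer)."""
    if indicator <= low:
        return 0
    if indicator >= high:
        return n_levels - 1
    step = (high - low) / n_levels
    return min(int((indicator - low) / step), n_levels - 1)


def frequency_pattern(level: int, n_levels: int, size: int = 8) -> List[List[bool]]:
    """Triangular pattern (assumed shape). A higher level retains fewer
    coefficients, each pattern is a subset of every lower-level pattern,
    and the DC coefficient at (0, 0) is always retained."""
    max_diag = (2 * (size - 1)) * (n_levels - 1 - level) // max(n_levels - 1, 1)
    return [[u + v <= max_diag for v in range(size)] for u in range(size)]


def constrain_block(block: Block, pattern: List[List[bool]]) -> Block:
    """Keep coefficients that belong to the pattern and zero the rest."""
    return [[coef if keep else 0 for coef, keep in zip(row, prow)]
            for row, prow in zip(block, pattern)]


def zero_padding_bytes(actual_bits: int, target_bits: int, guard_bits: int) -> int:
    """Zero bytes to append when a picture falls short of the target minus
    the guard band (one possible reading of the padding rule)."""
    floor_bits = target_bits - guard_bits
    if actual_bits >= floor_bits:
        return 0
    return (floor_bits - actual_bits) // 8
```

In this sketch a difficult picture after a shot change maps to a high level, so fewer coefficients survive and the picture costs fewer bits, while an easy picture that undershoots the target is padded with zero bytes.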
Residence: 153 McFadden Road, Apalachin, New York 13732 

Citizenship: United States of America 

Post Office Address: 153 McFadden Road 

Apalachin, New York 13732 



Full Name of third joint inventor: Nader Mohsenian 

Signature: [handwritten signature] Date: [illegible] 

Residence: Ravens Crest Apartments, 57-20 Ravens Crest Drive, 
Plainsboro, New Jersey 08536 

Citizenship: Iran 

Post Office Address: Ravens Crest Apartments 

57-20 Ravens Crest Drive 
Plainsboro, New Jersey 08536 



